Robust hypothesis testing with a relative entropy tolerance

Bernard C. Levy 
Abstract 

This paper considers the design of a minimax test for two hypotheses where the actual 
probability densities of the observations are located in neighborhoods obtained by placing a 
bound on the relative entropy between actual and nominal densities. The minimax problem 
admits a saddle point which is characterized. The robust test applies a nonlinear transformation 
which flattens the nominal likelihood ratio in the vicinity of one. Results are illustrated by 
considering the transmission of binary data in the presence of additive noise. 

Index Terms 

Robust hypothesis testing, Kullback-Leibler divergence, min-max problem, saddle point, 
least favorable densities. 

I. Introduction 

Robust hypothesis testing and signal detection problems have been examined in detail over 
the last 40 years [1], [2]. The purpose of such studies is to design tests or detectors which are 
insensitive to modelling errors. Specifically, whereas standard Bayesian or Neyman-Pearson tests 
are designed for nominal observation probability distributions, their performance may degrade 
rapidly when the actual model deviates only moderately from the nominal model. To guard against 
modelling errors, a minimax framework is usually adopted for selecting tests or detectors. In this 
context, the goal is to design a test that minimizes the worst-case performance for all observation 
models in a properly specified neighborhood of the nominal model. For robust hypothesis testing, 
when the neighborhood of the nominal model under each hypothesis corresponds either to 
a contamination model or a proximity model based on the Kolmogorov metric or a variant 
thereof, Huber [2]-[4] showed that the minimax detector applies a clipping transformation to the 
nominal likelihood ratio function. The clipping effect is achieved by shifting small portions of 
the probability mass under each hypothesis to the tail sections where errors occur. This relatively 
minute shift of probability mass can result in a significant degradation in test performance. 

We adopt here a minimax formulation of the robust hypothesis testing problem of the same 
type as [2]-[4]. The only difference is that the neighborhood where the actual observation 
probability density is located under each hypothesis is formed by placing an upper bound on 
the relative entropy of the actual density with respect to the nominal density. To justify the 
choice of the relative entropy as a measure of proximity between statistical models, observe that 
Huber's work addresses primarily situations where statistical models are obtained directly from 
imperfect data, possibly contaminated by outliers. However, there also exist situations where the
densities employed in hypothesis testing are model based, arising from physical considerations,
possibly with a few unknown parameters which are estimated from the data. In this context, the
relative entropy is a natural metric for model mismatch, since it provides the underlying metric 
for establishing the convergence of the expectation-maximization method [5] of mathematical 
statistics. In fact from a differential geometric viewpoint, it is argued in [6] that the relative entropy 
forms a natural 'distance' between statistical models. More recently, in the context of estimation 
and filtering it was shown in [7], [8] that minimax filters based on a relative entropy tolerance take 
the form of risk-sensitive Wiener or Kalman filters, which have well-known robustness properties.
By selecting the relative entropy as a measure of model mismatch, a risk-sensitive viewpoint was 
also adopted recently in [9] for developing robust macroeconomic policies. Given relative entropy
neighborhoods of the nominal densities for the two hypotheses, it is easy to verify that a saddle 
point exists for the resulting minimax hypothesis testing problem. To identify the saddle point, two 
assumptions are made. First as in [3], it is assumed that the nominal likelihood ratio function 
(LR) is monotone increasing. Second, it is required that the nominal densities under the two 
hypotheses should be symmetric with respect to each other. This allows the parametrization of 
the robust test and least-favorable densities in terms of a single parameter which can be selected 
uniquely so that the relative entropy tolerance is satisfied. The least-favorable LR is expressed 
as a nonlinear transformation of the nominal LR. But, unlike [2]-[4], the transformation is not a 
clipping transformation. Instead, it attempts to drive the LR to a value as close to one as possible.
The least-favorable densities are divided into three segments. The extreme segments are scaled 
versions of the nominal densities, where the scaling aims at shifting some probability mass to 
tails where errors occur. But the middle segment is a section of the "mid-way density" on the 
geodesic linking the two nominal densities, where the mid-way density is characterized by the 
property that it has the same relative entropy with respect to each of the nominal densities. 

The robust hypothesis testing problem we consider is also related to the worst-case noise 
detection problem examined in [10], [11], where given a binary communication system with 
additive noise, with the actual noise density located within a prespecified relative entropy bound 
of the nominal noise density, it is required to find the ML detector for the worst-case noise in the 
neighborhood of the nominal noise. Thus the difference between the problem we consider and 
[11] is that we allow the additive noise statistics to be different under each hypothesis, instead of 
forcing them to be the same. Finally, it is worth noting that [12] also examines robust hypothesis 
testing by using the relative entropy as a mismatch metric between actual and nominal densities, 
but it does so asymptotically as the number of measurements becomes infinite, so its results take 
a very different form. 

The paper is organized as follows. Section II describes the minimax hypothesis problem with
a relative entropy constraint. The saddle point of the problem is characterized in Section III and
examples are presented in Section IV. Finally, Section V gives some conclusions.

II. Problem Formulation 

Consider a binary hypothesis testing problem where under hypothesis $H_j$, with $j = 0, 1$, the
random observation $Y \in \mathbb{R}$ admits $f_j(y)$ as nominal probability density. The actual density $g_j(y)$
of $Y$ under $H_j$ is not known exactly and belongs to the neighborhood

$$\mathcal{F}_j = \{ g_j : D(g_j|f_j) \le \epsilon_j \} , \tag{2.1}$$

where

$$D(g|f) = \int_{-\infty}^{\infty} \ln\Big(\frac{g(y)}{f(y)}\Big)\, g(y)\, dy \tag{2.2}$$

denotes the Kullback-Leibler (KL) divergence or relative entropy of probability densities $g(y)$
and $f(y)$. Note that the KL divergence is not a true distance: it is not symmetric, i.e.,
$D(g|f) \neq D(f|g)$, and it does not satisfy the triangle inequality. However, $D(g|f) \ge 0$ with equality if and
only if $g = f$. Also, since $x \ln(x)$ is a convex function for $x > 0$, $D(g|f)$ is convex in $g$, which
implies that neighborhood $\mathcal{F}_j$ is convex for $j = 0, 1$.
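As a numerical illustration of definition (2.2), the following minimal Python sketch approximates $D(g|f)$ with a midpoint rule and shows that the divergence is not symmetric. The Gaussian pair, the integration window, and the helper names `gauss_pdf` and `kl_divergence` are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def gauss_pdf(y, mean, std):
    """Density of N(mean, std^2) evaluated at y."""
    return math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def kl_divergence(g, f, lo=-30.0, hi=30.0, n=60000):
    """Midpoint-rule approximation of D(g|f) = int ln(g/f) g dy over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        y = lo + (k + 0.5) * h
        gy, fy = g(y), f(y)
        if gy > 0.0 and fy > 0.0:      # skip points where the integrand underflows
            total += gy * math.log(gy / fy) * h
    return total

# Hypothetical pair: g = N(1, 1) and f = N(-1, 4)
g = lambda y: gauss_pdf(y, 1.0, 1.0)
f = lambda y: gauss_pdf(y, -1.0, 2.0)

d_gf = kl_divergence(g, f)   # D(g|f)
d_fg = kl_divergence(f, g)   # D(f|g), differs from D(g|f)
d_gg = kl_divergence(g, g)   # zero, since D(g|f) = 0 iff g = f
```

For two Gaussian densities the divergence has the closed form $\ln(\sigma_f/\sigma_g) + (\sigma_g^2 + (m_g - m_f)^2)/(2\sigma_f^2) - 1/2$, which gives a convenient check on the quadrature.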

Let $\mathcal{D}$ denote the class of pointwise randomized decision rules $\delta(y)$ such that if $Y = y$, we
select $H_1$ with probability $\delta(y)$ and $H_0$ with probability $1 - \delta(y)$, where $0 \le \delta(y) \le 1$. Clearly $\mathcal{D}$
is convex, since if $\delta_1(y)$ and $\delta_2(y)$ are two decision rules of $\mathcal{D}$, then for $0 \le \alpha \le 1$,

$$\delta(y) = \alpha \delta_1(y) + (1 - \alpha) \delta_2(y)$$

also belongs to $\mathcal{D}$.
Let

$$P_F(\delta, g_0) = \int_{-\infty}^{\infty} \delta(y)\, g_0(y)\, dy \tag{2.3}$$

$$P_M(\delta, g_1) = \int_{-\infty}^{\infty} (1 - \delta(y))\, g_1(y)\, dy \tag{2.4}$$

denote respectively the probability of false alarm and the probability of a miss for decision rule
$\delta \in \mathcal{D}$ when the densities of $Y$ under $H_0$ and $H_1$ are $g_0$ and $g_1$, respectively. Note that $P_F(\delta, g_0)$
is separately linear in $\delta$ and $g_0$. Similarly $P_M(\delta, g_1)$ is separately linear in $\delta$ and $g_1$. If we assume
that the two hypotheses are equally likely, the probability of error of $\delta \in \mathcal{D}$ is given by

$$P_E(\delta, g_0, g_1) = \frac{1}{2}\big[ P_F(\delta, g_0) + P_M(\delta, g_1) \big] . \tag{2.5}$$
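We seek to solve the minimax problem

$$\min_{\delta \in \mathcal{D}} \; \max_{(g_0, g_1) \in \mathcal{F}_0 \times \mathcal{F}_1} P_E(\delta, g_0, g_1) . \tag{2.6}$$

Note that $P_E(\delta, g_0, g_1)$ is linear and thus convex in $\delta$. Similarly, it is linear and thus concave in
$g_0$ and $g_1$. The set $\mathcal{F}_0 \times \mathcal{F}_1$ is convex and compact, $\mathcal{D}$ is convex and since

$$\|\delta\|_\infty = \max_y \delta(y) \le 1$$

for all $\delta \in \mathcal{D}$, $\mathcal{D}$ is compact with respect to the infinity norm. So according to the Von Neumann
minimax theorem [13, p. 319], there exists a saddle point $(\delta_R, (g_0^L, g_1^L))$ for the minimax problem
(2.6). Here $\delta_R$ is the robust/minimax test, whereas $g_0^L$ and $g_1^L$ are the least favorable densities in
$\mathcal{F}_0 \times \mathcal{F}_1$. The saddle point is characterized by the property

$$P_E(\delta, g_0^L, g_1^L) \ge P_E(\delta_R, g_0^L, g_1^L) \ge P_E(\delta_R, g_0, g_1) \tag{2.7}$$

for all $\delta \in \mathcal{D}$, $g_0 \in \mathcal{F}_0$ and $g_1 \in \mathcal{F}_1$.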

While it is nice to know that a saddle point exists, exhibiting a test $\delta_R$ and least favorable
densities $g_j^L$, $j = 0, 1$ satisfying (2.7) is a nontrivial task. Before doing so, it is worth pointing
out that the minimax problem (2.6) is of the same type as considered by Huber in [2]-[4]. The
only difference is that the neighborhoods $\mathcal{F}_j$ differ from those considered in [2] which included
contamination models or proximity models based on the Kolmogorov metric as special cases.
The problem (2.6) is also closely related to the worst-case noise detection problem considered
in [11], where for hypotheses

$$H_0 : Y = -1 + N$$
$$H_1 : Y = 1 + N , \tag{2.8}$$

and a nominal probability density $f_N(n)$ for noise $N$, it was desired to construct a minimum
probability of error detector for the least-favorable noise density $g_N(n)$ located in the KL ball
specified by $D(g_N|f_N) \le \epsilon$. Thus the problem (2.6) differs from the one examined in [10], [11]
by the fact that we allow the least-favorable noise distribution to be different under hypotheses
$H_0$ and $H_1$, instead of insisting they should be the same.



III. Saddle Point Specification

The first inequality of the saddle point characterization (2.7) indicates that the robust test $\delta_R$
must be the optimum Bayesian test for the least-favorable pair $(g_0^L, g_1^L)$. So if

$$L_L(y) = \frac{g_1^L(y)}{g_0^L(y)} \tag{3.1}$$

denotes the LR function for the pair $(g_0^L, g_1^L)$, we need to have

$$\delta_R(y) = \begin{cases} 1 & \text{for } L_L(y) > 1 \\ \text{arbitrary} & \text{for } L_L(y) = 1 \\ 0 & \text{for } L_L(y) < 1 . \end{cases} \tag{3.2}$$

Consider now the second inequality of (2.7). Because of the form (2.5) of $P_E(\delta, g_0, g_1)$, it is
equivalent to

$$P_F(\delta_R, g_0^L) \ge P_F(\delta_R, g_0)$$
$$P_M(\delta_R, g_1^L) \ge P_M(\delta_R, g_1)$$

for all $g_0$ and $g_1$ in $\mathcal{F}_0$ and $\mathcal{F}_1$, respectively.

So, given $\delta_R$, the least-favorable density $g_0^L$ is obtained by maximizing $P_F(\delta_R, g_0)$ over all
functions $g_0 \in \mathcal{F}_0$ such that

$$I(g_0) = \int_{-\infty}^{\infty} g_0(y)\, dy = 1 . \tag{3.3}$$



Since $P_F(\delta_R, g_0)$ is concave in $g_0$ and the domain $\mathcal{F}_0$ is convex, the maximization can be
accomplished by using the method of Lagrange multipliers [14, Chap. 5]. Consider the Lagrangian

$$L(g_0, \lambda, \mu) = P_F(\delta_R, g_0) + \lambda(\epsilon_0 - D(g_0|f_0)) + \mu(1 - I(g_0))$$
$$\qquad = \int_{-\infty}^{\infty} \Big[ \delta_R(y) - \mu - \lambda \ln\Big(\frac{g_0}{f_0}\Big)(y) \Big] g_0(y)\, dy + \lambda \epsilon_0 + \mu , \tag{3.4}$$

where Lagrange multiplier $\lambda \ge 0$ is associated to the inequality constraint $D(g_0|f_0) \le \epsilon_0$, whereas
multiplier $\mu$ corresponds to equality constraint (3.3). Note that the non-negativity constraint
$g_0(y) \ge 0$ for the density function $g_0$ is not introduced explicitly, since the solution obtained
below by maximizing $L$ satisfies this constraint automatically.
The Gateaux derivative [14, p. 17] of $L$ with respect to $g_0$ in the direction of an arbitrary
function $z$ is given by

$$\nabla_{g_0, z} L(g_0, \lambda, \mu) = \lim_{h \to 0} \frac{1}{h}\big[ L(g_0 + hz, \lambda, \mu) - L(g_0, \lambda, \mu) \big]$$
$$\qquad = \int_{-\infty}^{\infty} \Big[ \delta_R - (\lambda + \mu) - \lambda \ln\Big(\frac{g_0}{f_0}\Big) \Big] z\, dy , \tag{3.5}$$

which must vanish at a maximum. Since $z(y)$ is arbitrary, this implies

$$\delta_R(y) - (\lambda + \mu) - \lambda \ln\Big(\frac{g_0}{f_0}\Big)(y) = 0 . \tag{3.6}$$

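In addition, the Karush-Kuhn-Tucker (KKT) condition

$$\lambda(\epsilon_0 - D(g_0|f_0)) = 0 \tag{3.7}$$

needs to be satisfied. Assume $\lambda > 0$, so $D(g_0|f_0) = \epsilon_0$, i.e., $g_0$ is on the boundary of $\mathcal{F}_0$. Then
(3.6) implies

$$g_0^L(y) = \frac{1}{Z_0} \exp(\alpha_0 \delta_R(y))\, f_0(y) \tag{3.8}$$

with

$$Z_0 = \exp\Big(1 + \frac{\mu}{\lambda}\Big) , \qquad \alpha_0 = \frac{1}{\lambda} .$$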

Note that since the nominal density $f_0(y) \ge 0$ for all $y$, the least-favorable density $g_0^L(y)$ specified
by (3.8) is also non-negative, so that the non-negativity constraint on $g_0$ is satisfied automatically.
Proceeding in a similar manner, we find that the least-favorable density under $H_1$ can be expressed
as

$$g_1^L(y) = \frac{1}{Z_1} \exp(\alpha_1 (1 - \delta_R(y)))\, f_1(y) \tag{3.9}$$

with $Z_1 > 0$.
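The exponential tilting structure of (3.8) can be illustrated on a finite alphabet. The sketch below is a discrete analogue (sums in place of integrals), not the continuous construction of the paper; the four-point nominal distribution `f0`, the decision rule `delta`, and the helper names are hypothetical. It shows that tilting the nominal distribution by $\exp(\alpha_0 \delta_R)$ moves mass toward outcomes where $H_1$ is selected, so the false alarm probability and the KL divergence both grow with $\alpha_0$.

```python
import math

def tilt(f, delta, alpha):
    """Discrete analogue of (3.8): g proportional to exp(alpha * delta) * f, normalized."""
    weights = [math.exp(alpha * d) * p for d, p in zip(delta, f)]
    z = sum(weights)                      # plays the role of Z_0
    return [w / z for w in weights]

def kl(g, f):
    """Discrete KL divergence D(g|f)."""
    return sum(p * math.log(p / q) for p, q in zip(g, f) if p > 0.0)

def p_false_alarm(delta, g):
    """Discrete analogue of (2.3): sum of delta * g."""
    return sum(d * p for d, p in zip(delta, g))

# Hypothetical nominal distribution under H_0 and randomized decision rule
f0 = [0.5, 0.3, 0.15, 0.05]
delta = [0.0, 0.25, 0.75, 1.0]

g0_a = tilt(f0, delta, alpha=1.0)
g0_b = tilt(f0, delta, alpha=2.0)

pf_nominal = p_false_alarm(delta, f0)
pf_a = p_false_alarm(delta, g0_a)
pf_b = p_false_alarm(delta, g0_b)
```

In this analogue, choosing $\alpha_0$ so that the divergence equals the tolerance corresponds to the boundary case $\lambda > 0$ of the KKT condition (3.7).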

Together, the expressions (3.2) for $\delta_R$ and (3.8)-(3.9) for $(g_0^L, g_1^L)$ provide some guidelines for
guessing a saddle point satisfying inequalities (2.7). We exhibit below a saddle point with the
desired structure under the following assumptions.

Assumptions:

i) The nominal likelihood ratio

$$L(y) = \frac{f_1(y)}{f_0(y)} \tag{3.10}$$

is a monotone increasing function of $y$. This implies that $\ell = L(y)$ admits an inverse function
$y = L^{-1}(\ell)$.

ii) $f_0(y)$ and $f_1(y)$ admit the symmetry

$$f_1(y) = f_0(-y) . \tag{3.11}$$

This assumption implies

$$L(-y) = \frac{f_1(-y)}{f_0(-y)} = \frac{f_0(y)}{f_1(y)} = \frac{1}{L(y)} ,$$

and thus $L(0) = 1$.
Remarks:

a) The monotonicity assumption for $L(y)$ appears also in [2]. The symmetry condition (3.11) has
the effect of symmetrizing the KL divergence of $f_0$ and $f_1$, since it ensures

$$D(f_0|f_1) = D(f_1|f_0) .$$

Furthermore, for $0 \le u \le 1$, if we consider the geodesic

$$f_u(y) = \frac{f_1^u(y)\, f_0^{1-u}(y)}{Z(u)}$$

linking nominal densities $f_0$ and $f_1$, where

$$Z(u) = \int_{-\infty}^{\infty} f_1^u(y)\, f_0^{1-u}(y)\, dy ,$$

the assumption ii) ensures that the density $f_{1/2}$ is located mid-way between $f_0$ and $f_1$ in
terms of the KL divergence, since

$$D(f_{1/2}|f_0) = D(f_{1/2}|f_1) .$$

We refer the reader to [15, Chap. 4] and [6] for a detailed discussion of the differential
geometric structure of statistical models.

b) For model (2.8), the above assumptions are satisfied if under both hypotheses $N$ admits a
generalized Gaussian density

$$f_N(n) = a \exp(-|n/b|^\alpha)$$

with $\alpha > 1$, where the constants $a$ and $b$ are adjusted to fix the variance of the distribution
and normalize its total probability mass. The case $\alpha = 2$ corresponds to a standard Gaussian
distribution. On the other hand, if $N$ is Cauchy distributed, it is easy to verify that

$$L(y) = \frac{f_N(y - 1)}{f_N(y + 1)}$$

is not monotone increasing so Assumption i) is not satisfied.

c) The assumptions allow the consideration of nonsymmetric noise distributions. For example,
consider model (2.8) where under $H_1$, $N$ admits the asymmetric Laplace density

$$f_L(n) = \begin{cases} c \exp(-an) & n \ge 0 \\ c \exp(bn) & n < 0 \end{cases} \tag{3.12}$$

with $b > a > 0$ and $c = (a^{-1} + b^{-1})^{-1}$, and under $H_0$, $N$ admits the flipped density $f_L(-n)$.
Then

$$f_1(y) = f_L(y - 1) \quad \text{and} \quad f_0(y) = f_L(-(y + 1))$$

satisfy the symmetry condition (3.11) and the log-likelihood ratio

$$\ln L(y) = \ln(f_1(y)/f_0(y)) = \begin{cases} (b - a)y + (b + a) & y > 1 \\ 2by & -1 \le y \le 1 \\ (b - a)y - (b + a) & y < -1 \end{cases} \tag{3.13}$$

is monotone increasing. Note that this property requires $b > a$, which ensures that the fat
tails of $f_1(y)$ and $f_0(y)$ are located on the opposite side of the location parameter of the
competing hypothesis. For example under $H_1$, the location parameter (the constant additive
term in (2.8)) is 1 so the fat tail extends over $[1, \infty)$, which is on the opposite side of the
location parameter $-1$ of the competing hypothesis $H_0$.
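The piecewise expression (3.13) and the symmetry condition (3.11) for the asymmetric Laplace model of remark c) can be checked numerically. The following sketch evaluates the densities directly from (3.12) and compares the resulting log-likelihood ratio with the closed form; the parameter values $a = 2$, $b = 4$ and the function names are illustrative assumptions.

```python
import math

def f_L(n, a, b):
    """Asymmetric Laplace density (3.12), with c = 1 / (1/a + 1/b)."""
    c = 1.0 / (1.0 / a + 1.0 / b)
    return c * math.exp(-a * n) if n >= 0.0 else c * math.exp(b * n)

def f1(y, a, b):
    """Nominal density under H_1: noise f_L shifted to location +1."""
    return f_L(y - 1.0, a, b)

def f0(y, a, b):
    """Nominal density under H_0: flipped noise f_L(-n) shifted to location -1."""
    return f_L(-(y + 1.0), a, b)

def log_lr_direct(y, a, b):
    """ln L(y) computed from the two densities."""
    return math.log(f1(y, a, b) / f0(y, a, b))

def log_lr_piecewise(y, a, b):
    """ln L(y) according to the closed form (3.13)."""
    if y > 1.0:
        return (b - a) * y + (b + a)
    if y < -1.0:
        return (b - a) * y - (b + a)
    return 2.0 * b * y

a, b = 2.0, 4.0                                   # illustrative values with b > a > 0
grid = [-5.0 + 0.01 * k for k in range(1001)]     # y from -5 to 5
max_gap = max(abs(log_lr_direct(y, a, b) - log_lr_piecewise(y, a, b)) for y in grid)
values = [log_lr_piecewise(y, a, b) for y in grid]
is_increasing = all(v2 > v1 for v1, v2 in zip(values, values[1:]))
```

With $b > a$ every branch of (3.13) has positive slope, which is why `is_increasing` holds on the whole grid.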

We can now prove the following result.

Theorem 1: Assume that the constants $\epsilon_j$ specifying neighborhoods $\mathcal{F}_j$ with $j = 0, 1$ are such
that $\epsilon_0 = \epsilon_1 = \epsilon$, where

$$0 < \epsilon < D(f_{1/2}|f_0) . \tag{3.14}$$

This requirement ensures that $\mathcal{F}_0$ and $\mathcal{F}_1$ do not intersect. Then under assumptions i)-ii) consider
the decision rule

$$\delta_R(y) = \begin{cases} 1 & y > y_U \\ \dfrac{1}{2}\Big[ 1 + \dfrac{\ln L(y)}{\ln \ell_U} \Big] & -y_U \le y \le y_U \\ 0 & y < -y_U , \end{cases} \tag{3.15}$$

and the least-favorable pair

$$g_0^L(y) = \begin{cases} \ell_U f_0(y)/Z(y_U) & y > y_U \\ \ell_U^{1/2} f_1^{1/2}(y) f_0^{1/2}(y)/Z(y_U) & -y_U \le y \le y_U \\ f_0(y)/Z(y_U) & y < -y_U \end{cases} \tag{3.16}$$

$$g_1^L(y) = \begin{cases} f_1(y)/Z(y_U) & y > y_U \\ \ell_U^{1/2} f_1^{1/2}(y) f_0^{1/2}(y)/Z(y_U) & -y_U \le y \le y_U \\ \ell_U f_1(y)/Z(y_U) & y < -y_U \end{cases} \tag{3.17}$$

which are parametrized by $y_U > 0$ and $\ell_U = L(y_U) > 1$. Here the normalizing constant $Z(y_U)$
is selected such that

$$I(g_0^L) = I(g_1^L) = 1 . \tag{3.18}$$

There exists a unique $y_U > 0$ such that

$$D(g_0^L|f_0) = D(g_1^L|f_1) = \epsilon , \tag{3.19}$$

and the corresponding $\delta_R$ and densities $(g_0^L, g_1^L)$ form a saddle point of minimax problem (2.6).
Before proving the result, it is worth noting that the least-favorable LR

$$L_L(y) = \begin{cases} L(y)/\ell_U > 1 & y > y_U \\ 1 & -y_U \le y \le y_U \\ \ell_U L(y) < 1 & y < -y_U \end{cases} \tag{3.20}$$

can be viewed as obtained by applying a nonlinearity $q(\cdot)$ to the nominal likelihood ratio $L$.
Specifically, we have

$$L_L = q(L) = \begin{cases} L/\ell_U & L > \ell_U \\ 1 & \ell_U^{-1} \le L \le \ell_U \\ \ell_U L & L < \ell_U^{-1} \end{cases} \tag{3.21}$$

where the nonlinearity $q(\cdot)$ is sketched in Fig. 1 below. This nonlinearity is different from the
clipping transformation obtained by Huber [2]-[4] which truncates high and low values of the
nominal likelihood ratio. Instead, the transformation $q(\cdot)$ attempts to force the transformed values
$L_L$ to be as close to 1 as possible, where a LR value $L_L = 1$ corresponds to a situation where
observation $Y = y$ is uninformative in terms of making a decision between $H_1$ and $H_0$.
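The nonlinearity (3.21) is simple enough to state as code. The sketch below implements $q(\cdot)$ and, for contrast, a generic two-sided clipping function of the kind the text attributes to Huber's test; the clipping limits and both function names are illustrative assumptions rather than quantities derived in the paper.

```python
def q_nonlinearity(L, ell_u):
    """Nonlinearity (3.21): pulls the nominal LR toward 1, with a flat zone around 1."""
    if ell_u <= 1.0:
        raise ValueError("ell_u = L(y_U) must exceed 1")
    if L > ell_u:
        return L / ell_u          # large LR values are scaled down
    if L < 1.0 / ell_u:
        return ell_u * L          # small LR values are scaled up
    return 1.0                    # uninformative zone: LR set to 1

def clip(L, low, high):
    """Generic clipping of the LR to [low, high], shown only for comparison."""
    return min(max(L, low), high)

ell_u = 3.0                                   # hypothetical value of L(y_U)
samples = [1.0 / 6.0, 0.5, 1.0, 2.0, 6.0]     # nominal LR values
flattened = [q_nonlinearity(L, ell_u) for L in samples]
clipped = [clip(L, 1.0 / ell_u, ell_u) for L in samples]
```

The contrast is visible in the two lists: `q_nonlinearity` leaves the tails unbounded but collapses the middle range to 1, whereas `clip` leaves the middle range untouched and bounds the tails.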









Fig. 1. Nonlinearity $q(\cdot)$ relating the nominal and least-favorable likelihood ratios.



Proof: Observe first that since the least-favorable LR is given by (3.20), the decision rule $\delta_R$
specified by (3.15) has the form (3.2). Note that since $\ell_U^{-1} \le L(y) \le \ell_U$ for $-y_U \le y \le y_U$, we
have

$$-1 \le \frac{\ln L(y)}{\ln \ell_U} \le 1$$

for $-y_U \le y \le y_U$, which ensures $0 \le \delta_R(y) \le 1$ for $-y_U \le y \le y_U$.

Next, with $\delta_R$ given by (3.15), it is easy to verify that the least favorable densities $g_0^L$ and $g_1^L$
given by (3.16) and (3.17) admit the forms (3.8) and (3.9) with $Z_0 = Z_1 = Z(y_U)$ and

$$\alpha_0 = \alpha_1 = \ln \ell_U .$$

To ensure that the normalization condition (3.18) holds we only need to select

$$Z(y_U) = \int_{-\infty}^{-y_U} f_0(y)\, dy + \ell_U^{1/2} \int_{-y_U}^{y_U} f_1^{1/2}(y) f_0^{1/2}(y)\, dy + \ell_U \int_{y_U}^{\infty} f_0(y)\, dy .$$

Then if $g_0^L(\cdot|y_U)$ represents the function (3.16), where the parametrization by $y_U > 0$ is written
explicitly, let

$$D(y_U) \triangleq D(g_0^L(\cdot|y_U)|f_0) = -\ln Z(y_U) + \frac{1}{Z(y_U)} \Big[ \ell_U \ln \ell_U \int_{y_U}^{\infty} f_0(y)\, dy + \ell_U^{1/2} \ln \ell_U \int_0^{y_U} f_1^{1/2}(y) f_0^{1/2}(y)\, dy \Big] \tag{3.22}$$

denote its KL divergence with respect to the nominal density $f_0$. For $y_U = 0$, we have $g_0^L(\cdot|0) = f_0$,
so $D(0) = 0$. Furthermore for $y_U = +\infty$, we have $g_0^L(\cdot|+\infty) = f_{1/2}$, so $D(+\infty) = D(f_{1/2}|f_0)$,
where as noted earlier the density $f_{1/2}$ represents the mid-way point on the geodesic
linking $f_0$ to $f_1$.
Taking the derivative of $D(y_U)$ with respect to $y_U$ gives

$$\frac{dD}{dy_U} = -\frac{1}{Z(y_U)} \frac{dZ}{dy_U} + \frac{d}{dy_U} \Big[ \frac{\ln \ell_U}{Z(y_U)} \Big( \ell_U \int_{y_U}^{\infty} f_0(y)\, dy + \ell_U^{1/2} \int_0^{y_U} f_1^{1/2}(y) f_0^{1/2}(y)\, dy \Big) \Big] = \frac{N(y_U)}{Z^2(y_U)} \frac{dL}{dy_U} , \tag{3.23}$$

where

$$N(y_U) = \ln \ell_U \Big[ \int_{y_U}^{\infty} f_0(y)\, dy \int_{y_U}^{\infty} f_1(y)\, dy + \frac{\ell_U^{-1/2}}{2} \int_0^{y_U} f_1^{1/2}(y) f_0^{1/2}(y)\, dy \int_{y_U}^{\infty} \big( f_1(y) + \ell_U f_0(y) \big)\, dy \Big] > 0$$

for $y_U > 0$. Since $L(y)$ is monotone increasing, we have $dL/dy_U > 0$ in (3.23), so $dD/dy_U > 0$.
Consequently, $D(y_U)$ is monotone increasing from $D(0) = 0$ for $y_U = 0$ to $D(f_{1/2}|f_0)$ for
$y_U = \infty$. Accordingly, given $\epsilon$ satisfying (3.14), there exists a unique $y_U$ such that $D(y_U) = \epsilon$.
For this choice of $y_U$, the least favorable densities $g_0^L$ and $g_1^L$ satisfy KKT condition (3.7), so
the second inequality of (2.7) is satisfied, and $\delta_R$ together with $(g_0^L, g_1^L)$ form the desired saddle
point. □
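Because $D(y_U)$ is monotone increasing, the parameter $y_U$ matching a given tolerance can be found by bisection. The sketch below does this for the Gaussian pair $f_0 = N(-1, \sigma^2)$, $f_1 = N(1, \sigma^2)$, for which $L(y) = \exp(2y/\sigma^2)$, $f_1^{1/2} f_0^{1/2}$ is $\exp(-1/(2\sigma^2))$ times the $N(0, \sigma^2)$ density, and $D(f_{1/2}|f_0) = 1/(2\sigma^2)$. The function names and the bracketing strategy are assumptions made for illustration; the paper does not specify a numerical method.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def kl_of_y_u(y_u, sigma=1.0):
    """D(y_U) of (3.22) for f_0 = N(-1, sigma^2) and f_1 = N(1, sigma^2)."""
    ell = math.exp(2.0 * y_u / sigma**2)        # ell_U = L(y_U)
    tail_f0 = q_func((y_u + 1.0) / sigma)       # integral of f_0 over (y_U, inf)
    tail_f1 = q_func((y_u - 1.0) / sigma)       # integral of f_1 over (y_U, inf),
                                                # equal by symmetry to that of f_0 over (-inf, -y_U)
    # integral over (0, y_U) of sqrt(f_1 f_0)
    mid = math.exp(-0.5 / sigma**2) * (0.5 - q_func(y_u / sigma))
    z = tail_f1 + 2.0 * math.sqrt(ell) * mid + ell * tail_f0    # Z(y_U)
    num = ell * tail_f0 + math.sqrt(ell) * mid
    return -math.log(z) + math.log(ell) * num / z

def solve_y_u(eps, sigma=1.0, tol=1e-12):
    """Bisection for the unique y_U with D(y_U) = eps, assuming 0 < eps < 1/(2 sigma^2)."""
    if not 0.0 < eps < 0.5 / sigma**2:
        raise ValueError("eps must satisfy 0 < eps < D(f_1/2 | f_0)")
    lo, hi = 0.0, 1.0
    while kl_of_y_u(hi, sigma) < eps:           # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid_pt = 0.5 * (lo + hi)
        if kl_of_y_u(mid_pt, sigma) < eps:
            lo = mid_pt
        else:
            hi = mid_pt
    return 0.5 * (lo + hi)

y_u = solve_y_u(0.1, sigma=1.0)
```

For $\epsilon = 0.1$ and $\sigma = 1$ this returns a value close to the $y_U = 0.6080$ reported in Example 1. Tolerances very close to the upper limit $1/(2\sigma^2)$ may overflow in `math.exp` while the bracket is being expanded, so this sketch is best used well inside the admissible range.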



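Worst case test performance: By taking into account the symmetries

$$1 - \delta_R(y) = \delta_R(-y)$$
$$g_1^L(y) = g_0^L(-y) \tag{3.24}$$

of the robust test and least favorable densities, which are a consequence of the symmetry
assumption (3.11), we find that the worst-case probabilities of false alarm and of a miss for
test $\delta_R$ satisfy

$$P_F(\delta_R, g_0^L) = P_M(\delta_R, g_1^L) = P_E(\delta_R, g_0^L, g_1^L) ,$$

where

$$P_E(\delta_R, g_0^L, g_1^L) = \int_{-y_U}^{y_U} \delta_R(y)\, g_0^L(y)\, dy + Z^{-1}(y_U)\, \ell_U \int_{y_U}^{\infty} f_0(y)\, dy$$
$$\qquad = Z^{-1}(y_U) \Big[ \ell_U^{1/2} \int_0^{y_U} f_1^{1/2}(y) f_0^{1/2}(y)\, dy + \ell_U \int_{y_U}^{\infty} f_0(y)\, dy \Big] . \tag{3.25}$$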
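Expression (3.25) can be evaluated in closed form for the Gaussian pair $f_0 = N(-1, \sigma^2)$, $f_1 = N(1, \sigma^2)$. The sketch below computes the worst-case error probability for a given $y_U$ and compares it with the nominal error probability $Q(1/\sigma)$; the function names are hypothetical, and the value $y_U = 0.608$ is the one quoted in Example 1 for $\epsilon = 0.1$ at 0 dB.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def worst_case_pe(y_u, sigma=1.0):
    """Worst-case error probability (3.25) for f_0 = N(-1, sigma^2), f_1 = N(1, sigma^2)."""
    ell = math.exp(2.0 * y_u / sigma**2)                  # ell_U = L(y_U)
    tail_f0 = q_func((y_u + 1.0) / sigma)                 # integral of f_0 over (y_U, inf)
    tail_f1 = q_func((y_u - 1.0) / sigma)                 # integral of f_0 over (-inf, -y_U)
    mid = math.exp(-0.5 / sigma**2) * (0.5 - q_func(y_u / sigma))   # int_0^{y_U} sqrt(f_1 f_0)
    z = tail_f1 + 2.0 * math.sqrt(ell) * mid + ell * tail_f0        # Z(y_U)
    return (math.sqrt(ell) * mid + ell * tail_f0) / z

def nominal_pe(sigma=1.0):
    """Error probability of the nominal ML detector, Q(SNR^{1/2}) with SNR = 1/sigma^2."""
    return q_func(1.0 / sigma)

pe_robust = worst_case_pe(0.608, sigma=1.0)   # worst case for eps = 0.1 at 0 dB
pe_nominal = nominal_pe(sigma=1.0)
```

Setting $y_U = 0$ removes the middle segment and recovers the nominal error probability, which provides a quick consistency check on the formula.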



IV. Examples 

Example 1: Consider the case where under $H_0$ and $H_1$, $Y$ admits the nominal distributions

$$f_0(y) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\Big( -\frac{(y + 1)^2}{2\sigma^2} \Big)$$
$$f_1(y) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\Big( -\frac{(y - 1)^2}{2\sigma^2} \Big) . \tag{4.1}$$

This corresponds to a model of the form (2.8) where the additive noise $N$ has a $N(0, \sigma^2)$
nominal distribution. The signal to noise ratio (SNR) for this detection problem is SNR $= 1/\sigma^2$.
The likelihood ratio

$$L(y) = \frac{f_1(y)}{f_0(y)} = \exp\Big( \frac{2y}{\sigma^2} \Big)$$

is clearly monotone increasing, and the nominal densities $f_j(y)$, $j = 0, 1$ admit the symmetry
(3.11), so the assumptions of Theorem 1 are satisfied. In this case, it is interesting to note that
the mid-way density

$$f_{1/2}(y) = \frac{f_1^{1/2}(y) f_0^{1/2}(y)}{Z(1/2)} = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\Big( -\frac{y^2}{2\sigma^2} \Big)$$

is $N(0, \sigma^2)$ distributed, which makes sense since $f_0$ and $f_1$ have opposite means $\mp 1$ but the same
variance $\sigma^2$.

If we consider the parametrization (3.16) of the least favorable density $g_0^L(y)$, we find that it
is continuous and formed by three segments. Over $(-\infty, -y_U)$, $g_0^L$ is an attenuated version of
the nominal $N(-1, \sigma^2)$ density. Over $[-y_U, y_U]$, it is a scaled version of the mid-way $N(0, \sigma^2)$
density, and for $(y_U, \infty)$ it is an amplified version of the nominal $N(-1, \sigma^2)$ density. Thus $g_0^L$
can be viewed as obtained from the nominal density $f_0$ by shifting a portion of its probability
mass to the middle segment where $g_0^L$ and $g_1^L$ are equal, and to the right tail where hypothesis
$H_1$ is selected, which has the effect of increasing the probability of false alarm.




Fig. 2. Plot of function $D(y_U)$ for 0 dB SNR.

To illustrate the construction of $g_0^L(y)$, let the relative entropy tolerance be $\epsilon = 0.1$. Then
for a nominal SNR equal to 0 dB ($\sigma = 1$), the function $D(y_U)$ measuring the KL divergence
of $g_0^L(\cdot|y_U)$ with respect to $f_0$ is plotted in Fig. 2. As expected, it is monotone increasing and
attains the desired tolerance value $\epsilon = 0.1$ for $y_U = 0.6080$. The least-favorable density $g_0^L(y)$ is
plotted together with the nominal density $f_0(y)$ in part a) of Fig. 3.




Fig. 3. Least favorable density $g_0^L(y)$ for a tolerance $\epsilon = 0.1$ and a) SNR = 0 dB, b) SNR = 10 dB.

The three segments of the density described earlier are clearly in evidence in this plot. Note
however that as the SNR increases, the middle segment shrinks. For example, the least-favorable
density for a SNR value of 10 dB is shown in part b) of Fig. 3. Although the KL tolerance $\epsilon = 0.1$
is the same as in part a), the deviation of $g_0^L$ away from $f_0$ is much smaller than for a SNR value
of 0 dB. Note also that $g_0^L$ is not symmetric about $-1$ since a fraction of the probability mass has
been transferred from the left tail to the right tail in the direction of the location parameter 1 of
the competing hypothesis $H_1$. Similarly the least favorable distribution $g_1^L(y) = g_0^L(-y)$ transfers
a portion of its probability mass from its right tail to its left tail. In terms of model (2.8), this
means that the least favorable densities of the noise $N$ are different under $H_0$ and $H_1$, since
one tilts rightward while the other tilts leftward. In contrast, [11] requires that the least-favorable
noise should be the same under both hypotheses. For the above example with $\epsilon = 0.1$ and 10 dB
SNR, the least favorable noise density is plotted in the South West corner of Figure 3 of [11]. It
is symmetric and thus differs from the least-favorable densities obtained here.

Finally, for $\epsilon = 0.01$ and $0.1$, and for SNR values between 0 and 15 dB, the worst-case
performance of the robust test $\delta_R$ given by (3.25) is compared in Fig. 4 with the probability
of error $P_E = Q(\mathrm{SNR}^{1/2})$ of the maximum likelihood detector for nominal densities (4.1).
As indicated by the figure, the loss of performance is rather spectacular. Of course, since this
performance represents a worst case situation, it is not truly indicative of the degradation incurred
for more benign choices of densities $g_j$ in $\mathcal{F}_j$ with $j = 0, 1$.




Fig. 4. Comparison of the worst case probability of error of test $\delta_R$ for $\epsilon = 0.01$ and $\epsilon = 0.1$ with the ML probability
of error for the nominal model (curves: nominal, eps=0.01, eps=0.1; horizontal axis: SNR).



Example 2: Consider model (2.8) where under $H_1$, $N$ admits the asymmetric Laplace density
$f_L(n)$ given by (3.12) with $b > a$ and under $H_0$, $N$ admits the flipped density $f_L(-n)$. Then
the densities

$$f_1(y) = f_L(y - 1) , \quad f_0(y) = f_L(-(y + 1))$$

satisfy the symmetry condition (3.11), and as indicated by (3.13), the likelihood ratio $L(y)$ is
monotone increasing. In this case, the half-way density

$$f_{1/2}(y) = \frac{f_1^{1/2}(y) f_0^{1/2}(y)}{Z(1/2)} = \begin{cases} c \exp(-b)/Z(1/2) & -1 \le y \le 1 \\ c \exp\Big( -\dfrac{(a + b)|y|}{2} + \dfrac{a - b}{2} \Big)/Z(1/2) & |y| > 1 \end{cases}$$

with

$$Z(1/2) = 2c \exp(-b) \Big( 1 + \frac{2}{a + b} \Big)$$

is constant for $-1 \le y \le 1$ and has a symmetrized exponential decay rate for its two tails. For
$y_U < 1$, the parametrization (3.16) of the least favorable density $g_0^L$ indicates that over segments
$(-\infty, -y_U)$ and $(y_U, \infty)$ it is proportional to $f_0$, but over $[-y_U, y_U]$ it is constant since $f_{1/2}$ is
constant.

To illustrate this feature the nominal and least favorable densities are plotted in Fig. 5 for
$a = 2$, $b = 4$, and $\epsilon = 0.1$. For this choice of parameters $y_U = 0.3640$.
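The value of $y_U$ in this example can be approximated with the same bisection idea applied to (3.22). The sketch below uses closed-form integrals of the asymmetric Laplace densities that hold only for $0 \le y_U \le 1$, which is the regime discussed in the text; the function names and the restriction of the search to $[0, 1]$ are assumptions of this illustration.

```python
import math

def kl_of_y_u_laplace(y_u, a=2.0, b=4.0):
    """D(y_U) of (3.22) for the asymmetric Laplace pair, valid for 0 <= y_U <= 1."""
    if not 0.0 <= y_u <= 1.0:
        raise ValueError("closed forms below assume 0 <= y_U <= 1")
    c = 1.0 / (1.0 / a + 1.0 / b)
    ell = math.exp(2.0 * b * y_u)                         # ell_U = L(y_U), middle branch of (3.13)
    tail_f0 = (c / b) * math.exp(-b * (y_u + 1.0))        # integral of f_0 over (y_U, inf)
    # integral of f_1 over (y_U, inf): part on (y_U, 1) plus part on (1, inf)
    tail_f1 = (c / b) * (1.0 - math.exp(-b * (1.0 - y_u))) + c / a
    mid = c * math.exp(-b) * y_u                          # int_0^{y_U} sqrt(f_1 f_0), constant integrand
    z = tail_f1 + 2.0 * math.sqrt(ell) * mid + ell * tail_f0   # Z(y_U)
    num = ell * tail_f0 + math.sqrt(ell) * mid
    return -math.log(z) + math.log(ell) * num / z

def solve_y_u_laplace(eps, a=2.0, b=4.0, tol=1e-12):
    """Bisection on [0, 1] for D(y_U) = eps; raises if the root lies outside this interval."""
    if eps <= 0.0 or kl_of_y_u_laplace(1.0, a, b) < eps:
        raise ValueError("eps is outside the range covered by 0 < y_U <= 1")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid_pt = 0.5 * (lo + hi)
        if kl_of_y_u_laplace(mid_pt, a, b) < eps:
            lo = mid_pt
        else:
            hi = mid_pt
    return 0.5 * (lo + hi)

y_u = solve_y_u_laplace(0.1, a=2.0, b=4.0)
```

With $a = 2$, $b = 4$ and $\epsilon = 0.1$ the result lies close to the value $y_U = 0.3640$ quoted above, and it is below 1 as the closed forms require.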




Fig. 5. Nominal asymmetric Laplace density $f_0$ and least favorable density $g_0^L$ for $a = 2$, $b = 4$ and tolerance $\epsilon = 0.1$.



V. Conclusion 

A minimax hypothesis testing procedure has been derived for a binary hypothesis testing problem
where the actual observation density under each hypothesis is required to be within a fixed
KL ball centered about the nominal density. The robust test applies a nonlinear transformation
which flattens the nominal LR in the vicinity of $L = 1$. The least-favorable densities include three
segments where, quite interestingly, the middle segment is formed by a section of the density
located mid-way on the geodesic linking the nominal densities under the two hypotheses.
The results were derived under a monotonicity condition for the LR as well as a symmetry
condition for the two hypotheses. While the first condition is benign and appears in Huber's
work [2]-[4], it would be desirable to remove the symmetry condition (3.11), since this would
open the way to the study of more general robust signal detection problems of the type discussed
in [1].